Polymorphisms in Cha o 1 and Cha o 2, major allergens of Japanese cypress (Chamaecyparis obtusa) pollen from a restricted region in Japan

Japanese cedar pollinosis is a major seasonal allergy in Japan, and Japanese cypress pollinosis is a growing concern because the cypress pollen season follows the cedar pollen season and cross-reactivity among allergens occurs between these closely related species. Allergens purified from pollen under unspecified collecting conditions can potentially heterogenous allergens profiles and batch to batch variability, and amino acid sequence variants in allergens possibly exist among trees. Polymorphisms have not been investigated for the cypress pollen major allergens, Cha o 1 and Cha o 2. Our aim was to examine the homogeneity of allergen amino acid sequences. DNA sequences of Cha o 1 and Cha o 2 from pollen collected from Chiba and Ibaraki prefectures and from needles of 47 plus trees located at seed orchards in Chiba Prefecture were examined by amplicon sequencing and amino acid substitutions were deduced. Sequence analysis of the pollen samples revealed that eight and seven residues of Cha o 2 were polymorphic, respectively. Thirteen residues in Cha o 2, including those residues identified in pollen, were deduced to be polymorphic for the plus trees. Cha o 2 expressed by the 47 plus trees included amino acid differences when compared with that of isoallergen Cha o 2.0101. No substitution was deduced in Cha o 1 for pollen taken from the two prefectures. One conservative amino acid substitution was deduced in Cha o 1 for the plus trees. Of the 47 plus trees examined, 38 were deduced to express only the isoallergen Cha o 1.0101 isoform, whereas eight trees were heterozygous and a single tree was homozygous for the non-synonymous mutation, which indicates relative uniformity of Cha o 1. Cha o 2 was found to be a heterogeneous allergen which suggests that studies using pollen from different trees may not give the same results.


Introduction
Cupressaceae is a widely distributed conifer family whose pollen causes allergic diseases [1]. In the Mediterranean region, Arizona cypress (Cupressus arizonica) and Mediterranean cypress (Cupressus sempervirens) release pollen that cause allergic reactions, and a major allergen from was essentially uniform with only one amino acid substitution identified. Conversely, Cha o 2 was found to have a variety of amino acid substitutions with significant frequencies. Composition and substitution frequencies in Cha o 2 deduced from pollen collected from Chiba Prefecture was partially different when compared with that collected from Ibaraki Prefecture. We raise the possibility that different source materials include different frequencies of Cha o 2 isoforms. We also discuss the possibility that Cha o 2 bearing a specific amino acid substitution located in a B or T cell epitope may be a useful tool for examining the epitope.

Needle and pollen DNA
Needles were sampled from 47 Chamaecyparis obtusa plus trees at the Sanmu (35.64152N/ 140.37328E) and Ohtaki (35.18267N/ 140.23231E) seed orchards of Chiba Prefectural Agriculture and Forestry Research Center with permission of Chiba Prefecture. Pollen collected from 18 Chamaecyparis obtusa trees grown in an artificial forest located in Chiba Prefecture was purchased (alpha-Pinene, Katori, Chiba, Japan). Collected pollen were mixed and stored at -20˚C. Chamaecyparis obtusa pollen from Ibaraki Prefecture was purchased (Yamizo pollen study group, Kuji, Ibaraki, Japan). Pollen was examined by microscopy to confirm morphologically no contamination of pollen from other species. DNA was extracted from needles and pollen using the DNeasy-Plant Mini kit (Qiagen, Hilden, Germany). Pollen was grounded with a mortar and pestle, whereas needles were minced and homogenized (disposable generator, Hard Tissue Tip, TH, OMNI Inc., Kennesaw, GA, USA) before DNA extraction.

Library preparation and next-generation sequencing (NGS)
Cha o 1 or Cha o 2 genomic DNA sequences were amplified by PCR using Gflex (Takara Bio Inc., Kusatsu, Shiga, Japan). PCR was performed by 40 cycles at 98˚C for 10 s, 60˚C for 15 s, and 68˚C for 30 s with a thermal cycler. For Cha o 1, forward (5'-cctgtaccttattagca gtccttgttttcc-3' or 5'-ctgctccctaaaaatggcttcc-3') and reverse (5'-aga tagtacgacattctatattttgtggatca-3' or 5'-agtccaacaatatacatcaca-3') primers were based on the Cha o 1 cDNA sequence (D45404) and used in PCR. For Cha o 2, forward (5'-tatcgaacaatatcttagatcgaatcggagttt-3') and reverse (5'-tgg atgaacaataattaactttggcttgcatgg-3') primers were based on the Cha o 2 genomic DNA sequence (AB211843.1) and the Cha o 2 cDNA sequence (Mori et al., 1999; data have not been deposited in a public database), respectively. PCR products were purified using the QiaQUICK PCR purification kit (Qiagen). The sequencing library was prepared from mixed PCR products using the Nextera XT DNA Library Prep kit (Illumina, San Diego, CA, USA) with the Nextera XT DNA Library Index kit (Illumina). For individuals whose original clones had not been confirmed at the molecular level, needles from three stands for each ID were obtained and equal amounts of amplified PCR products from these individuals were pooled. If more than three alleles were detected by NGS, the two major alleles were considered as true alleles for the plus trees in this work. We obtained paired-end libraries that were pooled and run simultaneously on an Illumina Miseq sequencing system by using the Miseq reagent kit v3 of the 150 bp paired-end (Illumina). Nucleotide sequence data are available from the DDBJ Sequenced Read Archive under accession numbers DRX249994-DRX250043.

Data analysis
Indexed sequences were de-multiplexed and analyzed individually. Read cleaning, mapping and variant calling were performed by Genedata Expressionist for Genomic Profiling ver.

PCR-Restriction Fragment Length Polymorphism (RFLP)
A partial Cha o 1 genomic DNA sequence was amplified by PCR using Gflex (Takara Bio Inc.) and a pair of DNA primers with the following sequences: forward (5'-tgtagggcacga tatggactt-3') and reverse (5'-tgcatgttagaaccccagca-3'). The primers were based on the Cha o 1 genomic DNA sequence (LC586088). Amplification was performed by 40 cycles at 98˚C for 10 s, 60˚C for 15 s, and 68˚C for 30 s with a thermal cycler. The PCR product was purified using the QIAEX II Gel Extraction kit (Qiagen) and restricted by PsiI (NEB, Ipswich, MA, USA). The length of the restricted PCR product was examined by TapeStation and D1000 ScreenTape instruments (Agilent, Santa Clara, CA, US).

Cha o 2 allergen polymorphisms in pollen
We explored Cha o 2 allergen polymorphism in pollen collected from artificial forests located in Chiba and Ibaraki prefectures. DNA sequences of Cha o 2 amplified from pollen DNA were examined by NGS. The genomic DNA sequence of Cha o 2 (AB211843.1) isolated from Uwajima 6 [22] was used as a reference, although the amino acid sequence (BAF32143) deduced from AB211843.1 differed by three amino acids when compared with that of isoallergen Cha o 2.0101 (Fig 1) [15,24]. Pollen was collected from 18 trees in Chiba Prefecture. Chamaecyparis obtusa is diploid and Cha o 2 is probably a single copy gene. Thus, maximum theoretical number of isoforms possible in the Chiba pollen is 36 (i.e., 18 trees multiplied by 2). The number of trees used for collecting pollen from Ibaraki Prefecture was unknown. No nonsense mutations were present. Sequencing analysis of pollen collected from Chiba Prefecture revealed eight amino acid substitutions (H3Q, E29A, A31T, V44L, F52Y, N250D, S274R and E390K) in Cha o 2 whereas seven (H3Q, E29A, A31T, V44L, F52Y, N250D and T376K) amino acid substitutions in Cha o 2 were identified for pollen collected from Ibaraki Prefecture (Fig 2). Six (H3Q, E29A, A31T, V44L, F52Y and N250D) of these substitutions were found in pollen from both prefectures, and the frequencies of their reads were similar between the two sources (Fig 2). Information on haplotypes of Cha o 2 and the frequency of the isoforms were not obtained because our analysis used short reads of 150 bp.  (Table 1), and their average frequencies in the 47 trees ranged between 0.01 and 0.77 (Fig 2). A no nonsense mutation was detected. All Cha o 2 substitutions deduced in pollen from Chiba and Ibaraki prefectures were also found in the plus trees. No plus tree was found to be homozygous for the Cha o 2 allele encoding isoallergen Cha o 2.0101. Possible heterozygous plus  [15,16]. The three residues that differed in BAF32143 [22] are indicated and other residues are shown as '.'. Undetermined residues are shown as '-'. Substitutions discovered in this study are shown below the BAF32143 sequence.

Cha o 2 DNA sequencing results from plus tree needles
https://doi.org/10.1371/journal.pone.0261327.g001 trees for the allele encoding Cha o 2.0101 were found (e.g., Kuno 2 in Table 1). Three plus trees, Miho 6, Naka 6 and Nishikawa 17, were deduced to express the Cha o 2 isoform of BAF32143. Of the 47 plus trees, 42 were found to be heterozygous for one or more of the nonsynonymous mutations in the Cha o 2 gene.

Cha o 1 allergen polymorphisms in pollen
We next explored Cha o 1 allergen polymorphism in pollen collected from artificial forests located in Chiba and Ibaraki prefectures. DNA sequences of Cha o 1 amplified from pollen DNA were determined by NGS. The genomic DNA sequence of Cha o 1 from the plus tree Miho 6 was sequenced by the Sanger method (LC586088) because no genomic DNA sequence of the Cha o 1 gene was available. The DNA sequence of the exons of LC586088 was found to be identical to that of cDNA reported previously (D45404) [14], and thus the reference sequence used in this report encodes the Cha o 1.0101 isoallergen. The same pollen DNA used for the survey of polymorphism in Cha o 2 was examined. No nonsense mutations were found. No non-synonymous mutation in Cha o 1 was identified in pollen from Chiba and Ibaraki prefectures.

Cha o 1 DNA sequencing results from plus tree needles
The DNA sequence of Cha o 1 was determined by NGS using needle samples from the 47 plus trees. NGS of the needle samples taken from the 47 plus trees detected only one substitution (L262I) in Cha o 1 from the plus tree Shisui 1. Shisui 1 is a possible homozygote for the nonsynonymous mutation because no read without the mutation was detected. The Cha o 1 PCR product from Shisui 1 was 0.6 kb longer than those from other plus trees, which indicates the presence of a 0.6 kb insertion into the Cha o 1 gene from Shisui 1 (Fig 3A and 3B). This possible insertion is located in the 2nd intron because NGS analysis revealed a gap in the 2nd intron of the gene. A no nonsense mutation was detected in the 47 plus trees.

PCR-RFLP of Cha o 1[L262I] from plus trees
We hypothesized that the SNP responsible for L262I in Cha o 1 was linked to the 0.6 kb insertion. In this case, the SNP would be difficult to detect from heterozygotes with the mutation by NGS because a longer sequence is generally not as efficiently amplified by PCR. A PCR-RFLP marker was developed to solve this issue, which detects both alleles with and without the mutation (Fig 3A, 3B). Shisui 1 was found to be homozygous for the mutation from this marker analysis, which is consistent with the results from NGS. The marker analysis confirmed that Miho 6, whose Cha o 1 sequence was identical to Cha o 1.0101, did not include the mutation. An artificial heterozygote for the mutation had both alleles; with and without the mutation (Fig 3B). The PCR-RFLP marker analysis revealed that 8 of the 47 trees in the orchards of Chiba Prefecture were heterozygotes for the SNP (Fig 3C, Table 2). The other 38 trees were found to not express the Cha o 1 with the L262I substitution. The L262I substitution was not detected in Cha o 1 sequences that were derived from pollen samples collected in Chiba and Ibaraki prefectures (data not shown).

Discussion
We identified a single amino acid substitution and 13 amino acid substitutions in Cha o 1 and Cha o 2 allergens, respectively, from pollen collected in Chiba and Ibaraki prefectures and from needles of 47 plus trees located at the seed orchards of Chiba Prefecture.

Homogeneity of Cha o 1
DNA encoding Cha o 1 with the L262I substitution from the plus trees of seed orchards was not detected by NGS or PCR-RFLP in pollen DNA collected in the Chiba and Ibaraki prefectures. However, this result does not exclude the possibility that Cha o 1 from pollen has the L262I substitution. Amplification of the allele coding I262 was probably lower when compared with the amplification of the allele coding L262 because the allele coding I262 yielded a longer PCR product. The NGS and PCR-RFLP data suggested a possible link between the 0.6 kb insertion in the 2nd intron of the Cha o 1 gene and the non-synonymous mutation. The   [25,26], and the average frequency of the allele among the 47 plus trees is only 0.11. Because no substitutions other than L262I were found in Cha o 1 from plus trees and pollen, Cha o 1 derived from pollen collected in Chiba Prefecture may behave uniformly when interacting with its antibodies that target Cha o 1.

Heterogeneity of Cha o 2
In contrast to Cha o 1, we detected 13 amino acid substitutions in Cha o 2. Among the 47 plus trees, no homozygous tree for the allele encoding isoallergen Cha o 2.0101 was detected. Of the 47 plus trees, 42 were heterozygotes for one or more of the 13 substitutions. This indicates that purified Cha o 2 derived from pollen collected in Chiba Prefecture was a heterogeneous allergen based on amino acid sequence analysis. Information on haplotypes of Cha o 2 and their frequencies were not obtained because our analysis used short reads of 150 bp. Out of 13 substitutions, 7 were also detected in pollen collected from 18 trees located in artificial forests in Chiba Prefecture. Furthermore, the read frequencies derived from NGS of five (E29A, A31T, V44L, F52Y, N250D) substitutions found in pollen were similar to the corresponding read frequencies obtained for the 47 plus trees (Fig 2). This is consistent with trees from artificial forests in Chiba Prefecture being likely progenies of the 47 plus trees. Five of the remaining eight substitutions were not detected in pollen from Chiba Prefecture and the read frequencies of H3Q, S274R and E390K differed from the corresponding average frequencies determined for the 47 plus trees (Fig 2). This observation may arise because the collected pollen sample size was small. Thus, collecting and analyzing larger sample sizes should give frequencies that are similar to those observed for the 47 plus trees.
Eight substitutions were detected in pollen Cha o 2 collected from Ibaraki Prefecture, which is located just north of Chiba Prefecture. These eight substitutions are also observed as part of the 13 substitutions found for Cha o 2 of plus trees from seed orchards of Chiba Prefecture. Ibaraki and Chiba prefectures belong to the same breeding district. In the same breeding district, seeds and seedlings can be shared for plantation. Therefore, species identified with substitutions in Cha o 2 may be similar between prefectures of the same district. About 1,000 plus trees of Chamaecyparis obtusa were selected and their progenies were planted throughout Japan, except for Hokkaido and Okinawa. Analysis of allozyme variation in plus trees of Chamaecyparis obtusa in Japan showed that the genetic differences among the breeding districts were smaller to those within the districts [27]. In breeding districts other than Chiba Prefecture belongs to, amino acid substitutions were not surveyed in this report and form a future research endeavor.

Significance of Cha o 2 residue substitutions
Cha o 2 was identified to have 13 substituted residues in its amino acid sequence and some trees had multiple polymorphisms on Cha o 2. Thus, the anti-Cha o 2 antibody may show different affinities toward Cha o 2 collected from different sites or different trees (Table 1). Studies requiring strict amino acid sequences, such as in the analysis of T cell epitope, may be affected by the pollen batch because a single amino acid substitution in the T cell epitope may affect T cell reactivity [28].
Our study provides useful data for developing an ELISA that quantifies Cha o 2 accurately. mAb may fail to recognize its antigen when a single amino acid substitution occurs in its epitope. For example, substitution at residues 55 and 352 of Cry j 1 was found to affect recognition by the monoclonal antibodies (mAbs) [20]. This situation potentially occurs for any allergen purified from natural products. Information on polymorphic amino acid residues in target allergens should facilitate appropriate antibody selection for robust ELISA development. Identification of non-synonymous mutations in advance should enable assessment of the effect of mutation on the affinity of mAbs by using pollen extracts derived from the plus trees. In this work we identified 13 amino acid substitutions in Cha o 2, which suggests that mAbs whose epitopes do not include these 13 residues should be ideally selected for developing ELISA to quantitate Cha o 2 levels.
Determination of the epitope sequence of the mAb is essential for allergy research. Usually synthetic peptide libraries based on the sequence of the isoallergen are used. However, in some cases, ambiguous results such as the presence of multiple positive peptides in distant regions prohibit accurate analysis. Substitutions in epitopes may help explain such cases. As Cha o 2 was found to have 13 substitutions and some of the substitution frequencies were significant, it is possible that an anti-Cha o 2 mAb raised against variants other than the isoallergen Cha o 2.0101 should be taken into consideration when an epitope is examined.
Plus trees expressing Cha o 2 with substitutions may be useful for studying structural epitopes. Pollen from trees with the same mutation can be collected and Cha o 2 with the specific substitution may be purified. The two polymorphic residues F52 and S274 of Cha o 2 are included in the corresponding sequences to the IgE epitopes of Cry j 2 of Japanese cedar pollinosis patients (Fig 4) [29][30][31]. This raises the possibility that the affinity of IgE toward Cha o 2 is dependent on residues 52 and 274. Synthetic peptides are often used for examining sequential epitopes, although this is not always suitable for structural epitopes [29]. For examining structural epitopes of Cha o 2, isoforms of Cha o 2 may become useful tools to characterize structural epitopes. In the present study, we did not examine the effect of substitutions on affinity to IgE from Japanese cedar pollinosis patients. To examine this, purified Cha o 2 with or without the substitution is necessary. Purifying an isoform of Cha o 2 with F52Y or S274R is difficult at present because (1) purification of Cha o 2 requires a considerable amount of pollen material, which is difficult to obtain because the number of plus trees with mutations encoding F52Y and S274R was only two and one, respectively (Table 1), and (2) a non-synonymous mutation encoding S274R was found only as a heterozygote (Nishitama 1). Further surveying of polymorphisms in Cha o 2 of plus trees will be helpful because homozygotes for specific non-synonymous mutations may be detected and multiple trees with the same non-synonymous mutation may be found.

Conclusion
We found 1 and 13 amino acid substitutions in Cha o 1 and Cha o 2, respectively, from plus trees located at nurseries in Chiba Prefecture and pollen collected in Chiba and Ibaraki prefectures. These polymorphisms in Cha o 2 may affect mAb affinity toward Cha o 2 when the antibody against Cha o 2 has been raised against a particular Cha o 2 amino acid sequence where one or more of the 13 substitutions resides in the antibody epitope.